Upgrade node delete timeout #5706
Conversation
This is [WIP] while it includes a hack to run more tests and changes from #5704.
Codecov Report: All modified and coverable lines are covered by tests ✅

@@           Coverage Diff           @@
##             main    #5706   +/-   ##
=======================================
  Coverage   52.84%   52.84%
=======================================
  Files         278      278
  Lines       29610    29610
=======================================
  Hits        15647    15647
  Misses      13146    13146
  Partials      817      817
=======================================
spec:
  machineTemplate:
-   nodeDeletionTimeout: 60s
+   nodeDeletionTimeout: 600s
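For context, a sketch of where this field sits in a KubeadmControlPlane spec. The field names follow the Cluster API types; the resource names and replica count here are illustrative, not taken from this PR:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: my-control-plane          # illustrative name
spec:
  replicas: 3                     # illustrative value
  machineTemplate:
    # Propagated to each control plane Machine. The timer starts when the
    # Machine is deleted, not when CAPI first tries to delete the Node, so
    # it must cover cordon/drain and infrastructure deletion as well.
    nodeDeletionTimeout: 600s
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AzureMachineTemplate
      name: my-control-plane-template   # illustrative name
```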
I was thinking this timeout was starting when CAPI first tries to delete the Node, but it actually starts as soon as the Machine is deleted. So this has to account for cordoning and draining the Node and deleting the infra.
/test pull-cluster-api-provider-azure-e2e-workload-upgrade

Linux machine pool bootstrap flake. No logs, so I can't tell exactly what happened :/ It got past upgrading the control plane, though, which is what this is attempting to fix.
/test pull-cluster-api-provider-azure-e2e-workload-upgrade

One more run and I'm convinced.
/test pull-cluster-api-provider-azure-e2e-workload-upgrade
Force-pushed b4ed31b to 549420a
This is cleaned up and ready for review.
/retitle Upgrade node delete timeout
/lgtm
LGTM label has been added. Git tree hash: 02c8d7dcf96b1c1ef2bd794598b3e518cccb37c1
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: willie-yao
/retest
@nojnhuh: The following test failed.
/retest
/cherry-pick release-1.20
@nojnhuh: #5706 failed to apply on top of branch "release-1.20".
What type of PR is this?
/kind flake
What this PR does / why we need it:
This PR adds a nodeDeletionTimeout to control plane nodes in workload-upgrade tests to increase the tolerance for failed node deletions.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #5705
Special notes for your reviewer:
TODOs:
Release note: